Parallelizing MCMC with Random Partition Trees
نویسندگان
چکیده
The modern scale of data has brought new challenges to Bayesian inference. In particular, conventional MCMC algorithms are computationally very expensive for large data sets. A promising approach to solve this problem is embarrassingly parallel MCMC (EP-MCMC), which first partitions the data into multiple subsets and runs independent sampling algorithms on each subset. The subset posterior draws are then aggregated via some combining rules to obtain the final approximation. Existing EP-MCMC algorithms are limited by approximation accuracy and difficulty in resampling. In this article, we propose a new EP-MCMC algorithm PART that solves these problems. The new algorithm applies random partition trees to combine the subset posterior draws, which is distribution-free, easy to resample from and can adapt to multiple scales. We provide theoretical justification and extensive experiments illustrating empirical performance.
منابع مشابه
Information theory tools to rank MCMC algorithms on probabilistic graphical models
We propose efficient MCMC tree samplers for random fields and factor graphs. Our tree sampling approach combines elements of Monte Carlo simulation as well as exact belief propagation. It requires that the graph be partitioned into trees first. The partition can be generated by hand or automatically using a greedy graph algorithm. The tree partitions allow us to perform exact inference on each ...
متن کاملTree-Guided MCMC Inference for Normalized Random Measure Mixture Models
Normalized random measures (NRMs) provide a broad class of discrete random measures that are often used as priors for Bayesian nonparametric models. Dirichlet process is a well-known example of NRMs. Most of posterior inference methods for NRM mixture models rely on MCMC methods since they are easy to implement and their convergence is well studied. However, MCMC often suffers from slow converg...
متن کاملBayesian Learning in Undirected Graphical Models: Approximate MCMC Algorithms
Bayesian learning in undirected graphical models—computing posterior distributions over parameters and predictive quantities— is exceptionally difficult. We conjecture that for general undirected models, there are no tractable MCMC (Markov Chain Monte Carlo) schemes giving the correct equilibrium distribution over parameters. While this intractability, due to the partition function, is familiar...
متن کاملA Parallel Solution to the Extended Set Union Problem with Unlimited Backtracking
In this paper, we study on the EREW-PRAM model a parallel solution to the extended set union problem with unlimited backtracking which maintains a dynamic partition Π of an n-element set S subject to the usual operations Find, Union, Backtrack and Restore as well as the new operations SetUnion, MultiUnion. The SetUnion operation is a special case of the commonly known Union operation aimed to u...
متن کاملWhich Spatial Partition Trees are Adaptive to Intrinsic Dimension?
Recent theory work has found that a special type of spatial partition tree – called a random projection tree – is adaptive to the intrinsic dimension of the data from which it is built. Here we examine this same question, with a combination of theory and experiments, for a broader class of trees that includes k-d trees, dyadic trees, and PCA trees. Our motivation is to get a feel for (i) the ki...
متن کامل